V18 Manifold-Guided Architecture — val_bpb 0.434 #663
raahilg wants to merge 2 commits into openai:main from
Conversation
Rephrase sentence for clarity regarding model initialization.
Pull request overview
Adds a new Parameter Golf submission (“V18 Manifold-Guided Architecture + Sparsemax Routing”) including the training script, run logs for two seeds, and the submission metadata/README describing results.
Changes:
- Introduces a new `train_gpt.py` implementing manifold construction + sparsemax-routed multi-hop message passing + manifold-guided attention, with int8+zlib export and roundtrip eval.
- Adds training logs for seed 42 and seed 27 runs (including post-quant BPB).
- Adds `submission.json` and a README documenting the approach and reported metrics.
Reviewed changes
Copilot reviewed 3 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| records/track_10min_16mb/2026-03-24_V18_ManifoldGuided_Sparsemax/train_gpt.py | New training/manifold/quantization script for the V18 submission. |
| records/track_10min_16mb/2026-03-24_V18_ManifoldGuided_Sparsemax/train_seed42.log | Seed 42 training record and final int8+zlib roundtrip metrics. |
| records/track_10min_16mb/2026-03-24_V18_ManifoldGuided_Sparsemax/train_seed27.log | Seed 27 training record and final int8+zlib roundtrip metrics. |
| records/track_10min_16mb/2026-03-24_V18_ManifoldGuided_Sparsemax/submission.json | Submission metadata (score, size, blurb, author/date). |
| records/track_10min_16mb/2026-03-24_V18_ManifoldGuided_Sparsemax/README.md | Write-up of the method, results table, and run instructions. |
```python
if not files:
    raise FileNotFoundError(f"No files found for pattern: {pattern}")
tokens = torch.cat([load_data_shard(file) for file in files]).contiguous()
usable = ((tokens.numel() - 1) // seq_len) * seq_len
```
`load_validation_tokens` can return a too-short tensor when the validation split has fewer than `seq_len + 1` tokens (or when `seq_len` is set too large). That leads to `total_seqs == 0` in `eval_val` and a divide-by-zero when computing `val_loss`/`val_bpb`. Add the same `usable <= 0` guard the baseline scripts use and raise a clear `ValueError` when the validation set is too short for the configured sequence length.
Suggested change:

```diff
 usable = ((tokens.numel() - 1) // seq_len) * seq_len
+if usable <= 0:
+    raise ValueError(
+        f"Validation set too short for seq_len={seq_len}: "
+        f"only {tokens.numel()} tokens available."
+    )
```
```python
# Physics simulation (runs on rank 0's GPU, broadcast result)
torch.cuda.empty_cache()
log0(f"\n  Physics simulation ({args.physics_dim}D, {args.physics_steps} steps)...")
torch.manual_seed(args.seed)
torch.cuda.manual_seed(args.seed)
pos = torch.nn.Parameter(torch.randn(V, args.physics_dim, device=device) * 0.1)
opt_sim = torch.optim.Adam([pos], lr=0.05)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt_sim, args.physics_steps, eta_min=0.0005)

src_t = torch.tensor(np.concatenate([rows, cols]), dtype=torch.long, device=device)
dst_t = torch.tensor(np.concatenate([cols, rows]), dtype=torch.long, device=device)
sw_t = torch.tensor(np.concatenate([spring_w, spring_w]), dtype=torch.float32, device=device)
mass_t = torch.tensor(entropic_mass, dtype=torch.float32, device=device)
asym_t = torch.tensor(asymmetry, dtype=torch.float32, device=device)
dsrc_t = torch.tensor(np.concatenate([dir_rows, dir_cols]), dtype=torch.long, device=device)
ddst_t = torch.tensor(np.concatenate([dir_cols, dir_rows]), dtype=torch.long, device=device)
dw_t = torch.tensor(np.concatenate([dir_w_vals, dir_w_vals]), dtype=torch.float32, device=device)
n_rep = min(80000, V*(V-1)//2)
sf = (V*(V-1)//2) / n_rep
# CPU RNG for deterministic physics across different GPU hardware
phys_rng = torch.Generator()
phys_rng.manual_seed(12372)
t0 = time.time()

for step in range(args.physics_steps):
    opt_sim.zero_grad()
    n_ss = min(200000, len(src_t))
    si = torch.randint(0, len(src_t), (n_ss,), generator=phys_rng).to(device)
    d = pos[src_t[si]] - pos[dst_t[si]]
    E_spring = (len(src_t)/n_ss) * torch.sum(sw_t[si] * torch.sum(d**2, dim=1))

    ri = torch.randint(0, V, (n_rep,), generator=phys_rng).to(device)
    rj = torch.randint(0, V-1, (n_rep,), generator=phys_rng).to(device)
    rj = rj + (rj >= ri).long()
    E_rep = sf * torch.sum(1.0 / torch.norm(pos[ri]-pos[rj], dim=1).clamp(min=1e-4))

    ai_idx = torch.where(asym_t > asym_t.median())[0]
    n_ap = min(2000, len(ai_idx)*(len(ai_idx)-1)//2)
    if n_ap > 0 and len(ai_idx) > 1:
        ai = ai_idx[torch.randint(0, len(ai_idx), (n_ap,), generator=phys_rng).to(device)]
        aj = ai_idx[torch.randint(0, len(ai_idx), (n_ap,), generator=phys_rng).to(device)]
        mk = ai != aj; ai, aj = ai[mk], aj[mk]
        E_torsion = 0.5 * torch.sum(
            asym_t[ai]*asym_t[aj] / torch.norm(pos[ai]-pos[aj], dim=1).clamp(min=1e-4)
        ) if len(ai) > 0 else torch.tensor(0.0, device=device)
    else:
        E_torsion = torch.tensor(0.0, device=device)

    gi = torch.randint(0, V, (n_rep,), generator=phys_rng).to(device)
    gj = torch.randint(0, V-1, (n_rep,), generator=phys_rng).to(device)
    gj = gj + (gj >= gi).long()
    E_grav = -sf * 0.1 * torch.sum(
        mass_t[gi]*mass_t[gj] / torch.norm(pos[gi]-pos[gj], dim=1).clamp(min=1e-4))

    if len(dsrc_t) > 0:
        n_ds = min(100000, len(dsrc_t))
        di = torch.randint(0, len(dsrc_t), (n_ds,), generator=phys_rng).to(device)
        dd = pos[dsrc_t[di]] - pos[ddst_t[di]]
        E_dir = 0.3 * (len(dsrc_t)/n_ds) * torch.sum(dw_t[di] * torch.sum(dd**2, dim=1))
    else:
        E_dir = torch.tensor(0.0, device=device)

    (E_spring + E_rep + E_torsion + E_grav + E_dir).backward()
    torch.nn.utils.clip_grad_norm_([pos], 10.0)
    opt_sim.step(); sched.step()
    if step % 1000 == 0:
        log0(f"    physics step {step} ({time.time()-t0:.0f}s)")

positions = pos.detach().cpu().numpy()
del pos, opt_sim, sched
torch.cuda.empty_cache()
log0(f"  Physics done ({time.time()-t0:.0f}s)")

# Hessian eigendecomposition
log0(f"  Computing Hessian...")
coupling = np.zeros((V, V), dtype=np.float32)
for k in range(len(rows)):
    w = float(spring_w[k])
    coupling[rows[k], cols[k]] += 2 * w
    coupling[cols[k], rows[k]] += 2 * w
coupling += np.outer(entropic_mass, entropic_mass) * 0.1
for k in range(len(dir_rows)):
    v = float(dir_w_vals[k]) * 0.3
    coupling[dir_rows[k], dir_cols[k]] += v
    coupling[dir_cols[k], dir_rows[k]] += v
chunk = 256
for i in range(0, V, chunk):
    ie = min(i+chunk, V)
    diff = positions[i:ie, None, :] - positions[None, :, :]
    d = np.linalg.norm(diff, axis=-1)
    d = np.maximum(d, 1e-8)
    coupling[i:ie] += (1.0 / (d**3)).astype(np.float32)
np.fill_diagonal(coupling, 0)

coupling_t = torch.from_numpy(coupling.astype(np.float64))
evals_all, evecs_all = torch.linalg.eigh(coupling_t)
idx_ = torch.argsort(evals_all, descending=True)[:args.hessian_modes]
evals = evals_all[idx_].numpy()
evecs = evecs_all[:, idx_].numpy()
# Fix eigh sign ambiguity — make largest element in each column positive
for i in range(evecs.shape[1]):
    if evecs[np.argmax(np.abs(evecs[:, i])), i] < 0:
        evecs[:, i] *= -1
hessian_coords = (evecs * np.sqrt(np.abs(evals))[None, :]).astype(np.float32)
log0(f"  Hessian: {hessian_coords.shape}, eigenvalues: {evals[0]:.2f} → {evals[-1]:.2f}")

dir_scale = 0.5 * np.std(hessian_coords) / (np.std(directional_coords) + 1e-8)
syn_scale = 0.3 * np.std(hessian_coords) / (np.std(syntactic_coords) + 1e-8)
combined = np.concatenate([
    hessian_coords,
    directional_coords[:, -32:] * dir_scale,
    syntactic_coords[:, -32:] * syn_scale,
], axis=1).astype(np.float32)

log0(f"  Manifold ready: {combined.shape} ({time.time()-t_total:.0f}s total)")
return combined
```
`build_manifold_distributed` says the physics simulation "runs on rank 0's GPU, broadcast result", but the code currently runs the physics simulation and Hessian eigendecomposition on every rank. In multi-GPU runs this duplicates the most expensive work and can blow the wallclock budget. Consider gating the physics/Hessian section with `if rank == 0`, then broadcasting the resulting combined manifold coordinates (e.g., via `dist.broadcast` on a tensor) to the other ranks.
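A minimal sketch of the gating pattern this comment suggests. The helper below is hypothetical (the real script would replace the placeholder with its existing physics/Hessian code); it assumes all ranks already agree on `V` and the manifold width so non-zero ranks can allocate a matching receive buffer for `dist.broadcast`:

```python
import numpy as np
import torch
import torch.distributed as dist


def build_manifold_rank0_only(rank: int, world_size: int, V: int, dim: int,
                              device: str = "cpu") -> torch.Tensor:
    """Run the expensive manifold build only on rank 0, then broadcast it.

    The random matrix below is a stand-in for the physics simulation +
    Hessian eigendecomposition; only its (V, dim) float32 shape matters here.
    """
    if rank == 0:
        # Rank 0 does the real work (placeholder shown).
        combined = np.random.randn(V, dim).astype(np.float32)
        buf = torch.from_numpy(combined).to(device)
    else:
        # Other ranks allocate an empty buffer of the agreed-upon shape.
        buf = torch.empty(V, dim, dtype=torch.float32, device=device)
    if world_size > 1:
        # Fill every non-zero rank's buffer with rank 0's result in place.
        dist.broadcast(buf, src=0)
    return buf
```

With `world_size == 1` the broadcast is skipped entirely, so single-GPU runs keep their current behavior.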
```python
import subprocess
import sys
```
There are unused imports (`subprocess`, `sys`) in this file. Removing them reduces lint noise and avoids implying there are subprocess/system side effects.
Suggested change:

```diff
-import subprocess
-import sys
```
```python
import torch.distributed as dist
import torch.nn.functional as F
from torch import Tensor, nn
from torch.nn.parallel import DistributedDataParallel as DDP
```
`DistributedDataParallel as DDP` is imported but not used (the script implements manual reduce/broadcast instead). Dropping the unused import will avoid confusion about whether DDP is expected here.
Suggested change:

```diff
-from torch.nn.parallel import DistributedDataParallel as DDP
```
```python
import os
```
`os` is imported twice (near the top and again here). Removing the duplicate import keeps the module header tidy.
Suggested change:

```diff
-import os
```
```diff
@@ -0,0 +1,60 @@
+# V18 Manifold-Guided Architecture — val_bpb: 0.438
```
The README title reports `val_bpb: 0.438`, while `submission.json` reports `val_bpb: 0.4343`. If 0.438 is meant to be the mean across seeds, consider updating the title to say "mean val_bpb" (or update it to the best/official score) to avoid ambiguity.
Suggested change:

```diff
-# V18 Manifold-Guided Architecture — val_bpb: 0.438
+# V18 Manifold-Guided Architecture — mean val_bpb: 0.438
```
Standard language models must simultaneously construct an internal representation of token relationships and learn to navigate that representation to make predictions. We separate these two jobs.
By precomputing a physics-simulated token manifold from corpus co-occurrence statistics, we freeze the geometric structure directly into the architecture. The model's job changes from construction + navigation to just navigation — a much easier task that lets the weights specialize entirely on exploiting the geometric prior rather than building it from scratch.
The result is essentially a GNN operating on a precomputed token interaction graph — the manifold defines graph topology, sparsemax produces edge weights, and hop cells perform node updates with message passing. Every architecture decision is chosen to exploit this geometric prior: sparsemax routing along manifold geodesics, spectral-coordinate-conditioned attention, entropy-guided message passing, and parallel transport across the token manifold.
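For reference, a minimal sketch of the sparsemax transform (Martins & Astudillo, 2016) named above, which projects scores onto the probability simplex and, unlike softmax, assigns exact zeros to low-scoring entries; the PR's actual routing code may differ in details:

```python
import torch


def sparsemax(z: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Euclidean projection of z onto the probability simplex.

    Entries below the support threshold tau come out exactly zero,
    which is what makes sparsemax useful for pruning graph edges.
    """
    z_sorted, _ = torch.sort(z, dim=dim, descending=True)
    k = torch.arange(1, z.size(dim) + 1, device=z.device, dtype=z.dtype)
    shape = [1] * z.dim()
    shape[dim] = -1
    k = k.view(shape)
    z_cumsum = z_sorted.cumsum(dim) - 1
    # Support size: largest k with k * z_(k) > cumsum(z_(1..k)) - 1.
    support = (k * z_sorted > z_cumsum).to(z.dtype)
    k_max = support.sum(dim=dim, keepdim=True)
    tau = z_cumsum.gather(dim, k_max.long() - 1) / k_max
    return torch.clamp(z - tau, min=0.0)
```

For scores `[2.0, 1.0, 0.1]` this yields `[1.0, 0.0, 0.0]`: the two weak candidates are pruned outright rather than receiving small softmax mass.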
With a vocabulary of only 1024 tokens, the full pairwise statistics are trivially computable — the manifold captures essentially the complete statistical structure of the language. Normally, a model would need to rediscover these patterns through gradient descent; we hand them to a 20M-parameter model at initialization.
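To illustrate why the full pairwise statistics are cheap at this scale, here is a hedged sketch (the function name and windowed-count scheme are illustrative, not the PR's actual statistics) of a dense co-occurrence matrix over a token stream; at V = 1024 the V x V float32 matrix is only ~4 MB:

```python
import numpy as np


def cooccurrence_matrix(tokens: np.ndarray, V: int, window: int = 2) -> np.ndarray:
    """Count symmetric co-occurrences within a sliding window.

    Uses np.add.at so repeated (i, j) pairs accumulate correctly
    even when the same index appears multiple times in one batch.
    """
    counts = np.zeros((V, V), dtype=np.float32)
    for offset in range(1, window + 1):
        a, b = tokens[:-offset], tokens[offset:]
        np.add.at(counts, (a, b), 1.0)  # count i -> j at this offset
        np.add.at(counts, (b, a), 1.0)  # symmetrize
    return counts
```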